NSF PAR Search | NSF Public Access Repository

DocParseNet: Advanced Semantic Segmentation and OCR Embeddings for Efficient Scanned Document Annotation

Mohammadshirazi, Ahmad; Firoozsalari, Ali Nosrati; Zhou, Mengxi; Kulshrestha, Dheeraj; Ramnath, Rajiv (July 2024, https://doi.org/10.48550/arXiv.2406.17591)

Automating the annotation of scanned documents is challenging, requiring a balance between computational efficiency and accuracy. DocParseNet addresses this by combining deep learning and multi-modal learning to process both text and visual data. This model goes beyond traditional OCR and semantic segmentation, capturing the interplay between text and images to preserve contextual nuances in complex document structures. Our evaluations show that DocParseNet significantly outperforms conventional models, achieving mIoU scores of 49.12 on validation and 49.78 on the test set. This reflects a 58% accuracy improvement over state-of-the-art baseline models and an 18% gain compared to the UNext baseline. Remarkably, DocParseNet achieves these results with only 2.8 million parameters, reducing the model size by approximately 25 times and speeding up training by 5 times compared to other models. These metrics, coupled with a computational efficiency of 0.039 TFLOPs (BS=1), highlight DocParseNet's high performance in document annotation. The model's adaptability and scalability make it well-suited for real-world corporate document processing applications.
Full Text Available
Novel Physics-Based Machine-Learning Models for Indoor Air Quality Approximations

Mohammadshirazi, Ahmad; Nadafian, Aida; Monsefi, Amin Karimi; Rafiei, Mohammad; Ramnath, Rajiv (August 2023, https://doi.org/10.48550/arXiv.2308.01438)

Cost-effective sensors can capture, in real time, a variety of air quality-related modalities, from pollutant concentrations to indoor/outdoor humidity and temperature. Machine learning (ML) models can approximate air quality "ahead of time". Accurate indoor air quality approximation helps provide a healthy indoor environment, optimize the associated energy consumption, and maintain occupant comfort. However, it is crucial to design an ML architecture that captures the domain knowledge, the so-called problem physics. In this study, we propose six novel physics-based ML models for accurate indoor pollutant concentration approximations. The proposed models combine state-space concepts from physics, Gated Recurrent Units, and decomposition techniques. The proposed models were illustrated using data collected from five offices in a commercial building in California. They are shown to be less complex, computationally more efficient, and more accurate than similar state-of-the-art transformer-based models. Their superiority is due to their relatively light architecture (computational efficiency) and, more importantly, their ability to capture the underlying highly nonlinear patterns embedded in the often contaminated sensor-collected indoor air quality temporal data.
Full Text Available
Predicting airborne pollutant concentrations and events in a commercial building using low-cost pollutant sensors and machine learning: A case study

https://doi.org/10.1016/j.buildenv.2022.108833

Mohammadshirazi, Ahmad; Kalkhorani, Vahid Ahmadi; Humes, Joseph; Speno, Benjamin; Rike, Juliette; Ramnath, Rajiv; Clark, Jordan D. (April 2022, Building and Environment)

Full Text Available
